This paper establishes a necessary and sufficient condition for
the asymptotic normality of the nonparametric estimator of sample
coverage proposed by Good [Biometrika 40 (1953) 237-264]. This
new necessary and sufficient condition extends the validity of the
asymptotic normality beyond the previously proven cases.

1. Introduction. Suppose that a random sample of size $n$ is drawn (with
replacement) from a population of infinitely many species. Let $X_i(n)$ be
the frequency of the $i$th species in the sample. Let $p_n = \{p_{in}, i \ge 1\}$ with
$\sum_i p_{in} = 1$ and $P_n$ be probability measures under which the $i$th species has
probability $p_{in}$ of being sampled. The infinite sequence $X(n) = (X_i(n), i \ge 1)$
can be viewed as a multinomial$(n, p_n)$ vector under $P_n$. For all integers
$m \ge 1$,

$$P_n\{X_i(n) = x_i,\ i = 1,\ldots,m\} = \frac{n!\,\bigl(1 - \sum_{i=1}^m p_{in}\bigr)^{n - x_1 - \cdots - x_m}\,\prod_{i=1}^m p_{in}^{x_i}}{(n - x_1 - \cdots - x_m)!\,x_1!\cdots x_m!}.$$

Let $Q_n$ be the total probability of unobserved species and $F_j(n)$ be the
total number of species represented $j$ times in the sample. These random
variables can be written as

(1.1)
$$Q_n = \sum_{i=1}^\infty p_{in}\,\delta_{i0}(n), \qquad F_j(n) = \sum_{i=1}^\infty \delta_{ij}(n), \qquad \delta_{ij}(n) = I\{X_i(n) = j\}.$$
Good [10], while attributing an essential element of his proposal to A. M.
Turing, carefully developed and studied the estimation of $Q_n$ by

(1.2)
$$\hat Q_n = \frac{F_1(n)}{n}.$$

The total proportion of the species not represented in the sample, $Q_n$, and
its estimate $\hat Q_n$ have many interesting applications. For example, Efron
and Thisted [4] and Thisted and Efron [19] discuss two applications related
to Shakespeare's general vocabulary and the authorship of a poem; Good and
Toulmin [11] and Chao [1], among many others, discuss the probability of
discovering new species of animals in a population; and, more recently, Mao
and Lindsay [15] study a genomic application in gene categorization, and
Zhang [20] considers applications to network species and data
confidentiality problems. In addition, many authors have written about the
statistical properties of $\hat Q_n$. Among others, Harris [12, 13], Robbins [17],
Starr [18], Holst [14], Chao [2], Esty [5, 6, 7, 8, 9] and Chao and Lee [3] are
frequently referenced. However, of special relevance to the issue of concern
here is Esty [6], in which the asymptotic distributional behavior of the
coverage estimate under infinite dimensional probability vectors is discussed.
Esty [6] gives a sufficient condition for the asymptotic normality of a
$\sqrt{n}$-normalized coverage estimate. More specifically, Esty [6] proved that

(1.3)
$$\lim_{n\to+\infty} P_n\{Z_n \le t\} = P\{N(0,1) \le t\},$$

where

$$Z_n = \frac{n(\hat Q_n - Q_n)}{\{E_nF_1(n)(1 - E_nF_1(n)/n) + 2E_nF_2(n)\}^{1/2}},$$

for all real $t$ under the sufficient condition

(1.4)
$$E_nF_1(n)/n \to c_1 \in (0,1), \qquad E_nF_2(n)/n \to c_2 \ge 0.$$

Esty [6] also proved that (1.4) implies

(1.5)
$$\frac{n(\hat Q_n - Q_n)}{\{F_1(n)(1 - F_1(n)/n) + 2F_2(n)\}^{1/2}} \xrightarrow{D} N(0,1)$$

under $P_n$.
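
As an informal illustration of how (1.2) and (1.5) might be used together, the following Python sketch counts the frequencies of frequencies $F_j(n)$ in a sample, forms Good's estimate, and builds a normal-theory interval from the studentized statistic in (1.5). The function name `coverage_interval`, the default `z = 1.96` and the toy sample are assumptions made for this example only; the interval is not clipped to $[0, 1]$.

```python
import math
from collections import Counter

def coverage_interval(sample, z=1.96):
    """Good's estimate (1.2) and an interval based on the statistic in (1.5)."""
    n = len(sample)
    species_counts = Counter(sample)                  # X_i(n) for observed species
    freq_of_freq = Counter(species_counts.values())   # F_j(n)
    f1, f2 = freq_of_freq.get(1, 0), freq_of_freq.get(2, 0)
    q_hat = f1 / n                                    # F_1(n) / n
    # Standard error implied by the normalisation in (1.5).
    se = math.sqrt(f1 * (1.0 - f1 / n) + 2.0 * f2) / n
    return q_hat, (q_hat - z * se, q_hat + z * se)

# Hypothetical species labels: three singletons (b, d, f), two doubletons (a, c).
sample = ["a", "a", "b", "c", "c", "d", "e", "e", "e", "f"]
q_hat, (lo, hi) = coverage_interval(sample)
```

With so small a sample the asymptotic approximation is of course not meaningful; the sketch only shows which quantities enter the interval.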

In this paper, we extend the result of Esty [6] by establishing a necessary 
and sufficient condition for the asymptotic normality of the sample coverage. 
The family of distributions under the condition of this paper includes that 
of Esty [6] as a proper subset. 

There are three sections in the remainder of the paper. The main results 
and proofs are given in Section 2. Several examples, including a few cases 
satisfying and a few cases not satisfying the new necessary and sufficient 
condition of the paper and a genomic application, are given in Section 3. 
The proofs of several lemmas are included in the Appendix. 
2. Main results and proofs.

2.1. Main results. Define

(2.1)
$$s_{\lambda n}^2 = \sum_{i=1}^\infty \bigl[\lambda p_{in}e^{-\lambda p_{in}} + (\lambda p_{in})^2 e^{-\lambda p_{in}}\bigr], \qquad s_n = s_{nn}.$$

Since $E_nF_j(n) = \sum_{i=1}^\infty \binom{n}{j} p_{in}^j (1 - p_{in})^{n-j}$ and $(1 - p_{in})^n \approx e^{-np_{in}}$, $s_n^2$ is an
approximation of $E_nF_1(n) + 2E_nF_2(n)$.
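
The quality of this approximation can be checked numerically for a finite probability vector. The sketch below evaluates $s_{\lambda n}^2$ from (2.1) and the exact value of $E_nF_1(n) + 2E_nF_2(n)$ under multinomial sampling; the function names and the choice of 1000 equally likely species with $n = 500$ are illustrative assumptions, not part of the paper.

```python
import math

def s2(p, lam):
    """s_{lambda n}^2 from (2.1) for a finite probability vector p."""
    return sum(lam * pi * (1.0 + lam * pi) * math.exp(-lam * pi) for pi in p)

def exact_f1_plus_2f2(p, n):
    """E_n F_1(n) + 2 E_n F_2(n) computed from the binomial marginals."""
    ef1 = sum(n * pi * (1.0 - pi) ** (n - 1) for pi in p)
    ef2 = sum(0.5 * n * (n - 1) * pi ** 2 * (1.0 - pi) ** (n - 2) for pi in p)
    return ef1 + 2.0 * ef2

p = [1.0 / 1000] * 1000   # hypothetical: 1000 equally likely species
n = 500
ratio = exact_f1_plus_2f2(p, n) / s2(p, n)   # close to 1 in this setting
```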

Theorem 1. Let $\hat Q_n = F_1(n)/n$ be the Good estimate of sample coverage
$Q_n$ as in (1.2) and (1.1). Let $s_n$ be as in (2.1). Suppose that

(2.2)
$$\limsup_{n\to\infty} E_nF_1(n)/n < 1.$$

Then, the central limit theorem (1.3) holds if and only if both

(2.3)
$$E_nF_1(n) + E_nF_2(n) \to \infty$$

and the Lindeberg condition

(2.4)
$$s_n^{-2}\sum_{i=1}^\infty (np_{in})^2 e^{-np_{in}} I\{np_{in} > \varepsilon s_n\} \to 0 \qquad \forall \varepsilon > 0$$

hold. In this case, (1.5) holds and

(2.5)
$$\lim_{n\to\infty} P_n\left\{\left|\frac{\hat Q_n}{Q_n} - 1\right| > \varepsilon\right\} = 0 \qquad \forall \varepsilon > 0.$$

Moreover, if (1.5) holds, then (2.3) and (2.4) imply each other.

Corollary 1. If (2.2) and (2.3) hold, then (1.3), (1.5) and (2.4) are
all equivalent.

Remark 1. If $p_{in} = p_i$ do not depend on $n$ (under a fixed probability
measure $P_n = P$), then $E_nF_1(n)/n \to 0$ always holds. In this case, Esty's [6]
theorem is not applicable.

Remark 2. We call (2.4) the Lindeberg condition, since it is equivalent
to the standard Lindeberg condition when the sample size is a Poisson
variable with mean $n$. Due to

$$\sum_{i=1}^\infty (np_{in})^2 e^{-np_{in}} I\{np_{in} > M\}
\le \sum_{j=0}^\infty \sum_{i=1}^\infty M2^{j+1} e^{-M2^j}\, np_{in}\, I\{M2^j < np_{in} \le M2^{j+1}\}
= O(1)\, nMe^{-M}$$

with $M = \varepsilon s_n$, the Lindeberg condition (2.4) holds if $s_n/\log n \to \infty$.
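
For a concrete feel for condition (2.4), the following sketch evaluates its left-hand side for a finite probability vector and a fixed $\varepsilon$. The function name `lindeberg_ratio` and the two uniform configurations are hypothetical choices for illustration; a single $n$ and $\varepsilon$ can of course only hint at the limiting behaviour.

```python
import math

def lindeberg_ratio(p, n, eps):
    """Left-hand side of (2.4) for a finite probability vector p."""
    s2 = sum(n * pi * (1.0 + n * pi) * math.exp(-n * pi) for pi in p)
    s = math.sqrt(s2)
    # Only species with n * p_in above the threshold eps * s_n contribute.
    big = sum((n * pi) ** 2 * math.exp(-n * pi) for pi in p if n * pi > eps * s)
    return big / s2

# Many rare species: no term exceeds the threshold, so the ratio vanishes.
many_rare = lindeberg_ratio([1.0 / 1000] * 1000, n=500, eps=0.1)
# Few common species: every term exceeds the threshold, so the ratio stays large.
few_common = lindeberg_ratio([0.1] * 10, n=100, eps=0.1)
```

In the second configuration $s_n$ is small and every $np_{in}$ exceeds $\varepsilon s_n$, so the ratio equals $np_{in}/(1 + np_{in}) = 10/11$.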
Remark 3. We prove, in Lemma 1 below, that $E_nF_1(n) + 2E_nF_2(n)$
and $s_n^2$ are within an infinitesimal fraction of each other if one of these
quantities is bounded away from zero. Thus, condition (2.3) holds if and
only if $s_n \to \infty$.

Remark 4. Theorem 1 is proved using Poisson approximation. The only
case not covered is $E_nF_1(n)/n \to 1$, where the Poisson approximation fails
and Esty's theorem does not apply.

Theorem 2. Suppose (2.4) holds and $E_nF_1(n) \to c^* \in (0, \infty)$. Then,
$E_nF_2(n) \to 0$,

$$E_n(nQ_n - c^*)^2 \to 0, \qquad n\hat Q_n = F_1(n) \xrightarrow{D} N_{c^*},$$

under $P_n$, where $N_{c^*}$ is a certain Poisson variable with mean $c^*$.



2.2. Poisson approximation and proofs of theorems. Suppose the population
is sampled sequentially, so that $X(m) - X(m-1)$, $m \ge 1$, are i.i.d.
multinomial$(1, p_n)$ under $P_n$. Define

(2.6)
$$\zeta_n = \sum_{i=1}^\infty \{\delta_{i1}(n) - np_{in}\delta_{i0}(n)\} = n(\hat Q_n - Q_n).$$

Let $N_\lambda$ be a Poisson process independent of $\{X(m), m \ge 1\}$ with $E_nN_\lambda = \lambda$.
Define

(2.7)
$$\zeta_{\lambda n} = \sum_{i=1}^\infty Y_{i\lambda n}, \qquad Y_{i\lambda n} = \delta_{i1}(N_\lambda) - \lambda p_{in}\delta_{i0}(N_\lambda).$$

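Under probability $P_n$, $\{X_i(N_\lambda), i \ge 1\}$ are independent Poisson variables
with means $\lambda p_{in}$, so that $\{Y_{i\lambda n}, i \ge 1\}$ are independent zero-mean variables
with

(2.8)
$$E_nY_{i\lambda n}^2 = \sigma_{i\lambda n}^2 = \lambda p_{in}e^{-\lambda p_{in}} + (\lambda p_{in})^2 e^{-\lambda p_{in}}, \qquad
E_n\zeta_{\lambda n}^2 = \sum_{i=1}^\infty \sigma_{i\lambda n}^2 = s_{\lambda n}^2.$$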
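
The moment identities in (2.8) can be checked by simulation. The sketch below, which is only an illustrative Monte Carlo experiment under assumed settings (50 equally likely species, $\lambda = 100$, 4000 replications, a fixed seed), draws independent Poisson counts, forms $\zeta_{\lambda n}$ as in (2.7), and compares the sample mean and variance with $0$ and $s_{\lambda n}^2$. The helper `poisson` is a simple textbook sampler written for this example rather than a library routine.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def zeta_poissonized(p, lam, rng):
    """One draw of zeta_{lambda n} in (2.7) using independent Poisson counts."""
    total = 0.0
    for pi in p:
        x = poisson(lam * pi, rng)
        if x == 1:
            total += 1.0          # delta_{i1}(N_lambda) = 1
        elif x == 0:
            total -= lam * pi     # -lambda * p_in * delta_{i0}(N_lambda)
    return total

rng = random.Random(0)
p, lam, reps = [1.0 / 50] * 50, 100.0, 4000   # hypothetical settings
draws = [zeta_poissonized(p, lam, rng) for _ in range(reps)]
mean = sum(draws) / reps
var = sum((d - mean) ** 2 for d in draws) / (reps - 1)
s2 = sum(lam * pi * (1.0 + lam * pi) * math.exp(-lam * pi) for pi in p)
```

Up to Monte Carlo error, `mean` should be near zero and `var` near `s2`.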

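Theorem 3. Suppose $\lambda = \lambda_n \to \infty$. Then,

(2.9)
$$\zeta_{\lambda n}/s_{\lambda n} \xrightarrow{D} N(0,1),$$

if and only if both $s_{\lambda n} \to \infty$ and

(2.10)
$$s_{\lambda n}^{-2}\sum_{i=1}^\infty (\lambda p_{in})^2 e^{-\lambda p_{in}} I\{\lambda p_{in} > \varepsilon s_{\lambda n}\} \to 0 \qquad \forall \varepsilon > 0.$$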
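Proof of Theorem 3. By the Lindeberg-Feller central limit theorem,
(2.9) holds if and only if

(2.11)
$$\max_{i \ge 1} \sigma_{i\lambda n}^2/s_{\lambda n}^2 = \max_{i \ge 1} s_{\lambda n}^{-2}\bigl[\lambda p_{in}e^{-\lambda p_{in}} + (\lambda p_{in})^2 e^{-\lambda p_{in}}\bigr] \to 0,$$

and the standard Lindeberg condition holds in the form

(2.12)
$$s_{\lambda n}^{-2}\sum_{i=1}^\infty E_nY_{i\lambda n}^2 I\{|Y_{i\lambda n}| > \varepsilon s_{\lambda n}\} \to 0 \qquad \forall \varepsilon > 0.$$

Since $\delta_{ij}(N_\lambda)$ are 0-1 variables and $Y_{i\lambda n}^2 = \delta_{i1}(N_\lambda) + (\lambda p_{in})^2\delta_{i0}(N_\lambda)$,

$$2^{-1}Y_{i\lambda n}^2 I\{Y_{i\lambda n}^2 > 2(\varepsilon s_{\lambda n})^2\}
\le \delta_{i1}(N_\lambda) I\{1 > \varepsilon s_{\lambda n}\} + (\lambda p_{in})^2\delta_{i0}(N_\lambda) I\{\lambda p_{in} > \varepsilon s_{\lambda n}\},$$

which is no greater than $Y_{i\lambda n}^2 I\{|Y_{i\lambda n}| > \varepsilon s_{\lambda n}\}$. Thus, (2.12) is equivalent to

(2.13)
$$s_{\lambda n}^{-2}\sum_{i=1}^\infty \bigl\{\lambda p_{in}e^{-\lambda p_{in}} I\{1 > \varepsilon s_{\lambda n}\} + (\lambda p_{in})^2 e^{-\lambda p_{in}} I\{\lambda p_{in} > \varepsilon s_{\lambda n}\}\bigr\} \to 0 \qquad \forall \varepsilon > 0.$$

If $s_{\lambda n} \to \infty$, then (2.10) implies (2.13) immediately and (2.11) via
$(\lambda p_{in})^j e^{-\lambda p_{in}} \le j!$, $j = 1, 2$.

It remains to prove that (2.11) and (2.13) together imply $s_{\lambda n} \to \infty$ and
(2.10). In fact, (2.11) is not even needed. If $s_{\lambda n} \le M$ along a subsequence,
then, for $\varepsilon < 1/M$,

$$\begin{aligned}
s_{\lambda n}^2 &\le \sum_{i=1}^\infty \bigl\{2\lambda p_{in}e^{-\lambda p_{in}} + (\lambda p_{in})^2 e^{-\lambda p_{in}} I\{\lambda p_{in} > 1 > \varepsilon s_{\lambda n}\}\bigr\} \\
&\le 2\sum_{i=1}^\infty \bigl[\lambda p_{in}e^{-\lambda p_{in}} I\{1 > \varepsilon s_{\lambda n}\} + (\lambda p_{in})^2 e^{-\lambda p_{in}} I\{\lambda p_{in} > \varepsilon s_{\lambda n}\}\bigr],
\end{aligned}$$

so that (2.13) fails. Thus, (2.13) implies $s_{\lambda n} \to \infty$. This completes the proof,
since (2.13) implies (2.10) immediately. □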

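We prove Theorems 1 and 2 via Theorem 3 and the Poisson approximation

(2.14)
$$\frac{\zeta_n - \zeta_{nn}}{s_n} = o_{P_n}(1).$$

We need three lemmas.

Lemma 1. (i) Let $s_n^2$ be as in (2.1). For $\varepsilon/n \le 1/4$,

$$(1 - 1/n)e^{-\varepsilon}s_n^2 - n^2 e^{-\sqrt{n\varepsilon}} \le E_nF_1(n) + 2E_nF_2(n) \le e^{2\varepsilon}s_n^2 + n(n+1)e^{-(n-2)\varepsilon}.$$

Consequently, if $\liminf_n \min\{s_n^2, E_nF_1(n) + E_nF_2(n)\} > 0$, then
$\{E_nF_1(n) + 2E_nF_2(n)\}/s_n^2 \to 1$.

(ii) Let $s_{\lambda n}^2$ and $s_n^2$ be as in (2.1). For all $\lambda' < \lambda$ and $\varepsilon > 0$,

(2.15)
$$(\lambda'/\lambda)^2 s_{\lambda n}^2 \le s_{\lambda' n}^2 \le e^{\varepsilon}s_{\lambda n}^2 + \lambda(1+\lambda)\exp(-\lambda'\varepsilon/(\lambda - \lambda')).$$

Consequently, $s_{\lambda_n n}^2 = (1 + o(1))s_n^2$ if $n^2 e^{-\varepsilon n/|\lambda_n - n|} = o(s_n^2)$ for all $\varepsilon > 0$
and $\lambda_n/n \to 1$.

Lemma 2. Let $\zeta_{\lambda n}$ be as in (2.7). Then,

$$E_n\max_{\lambda \le t \le \lambda+\Delta} |\zeta_{tn} - \zeta_{\lambda n}|
\le 2\left\{\sum_{i=1}^\infty \lambda p_{in}(1 + \lambda p_{in})e^{-\lambda p_{in}}(1 - e^{-\Delta p_{in}})\right\}^{1/2} + 2\sum_{i=1}^\infty \Delta p_{in}e^{-\lambda p_{in}}.$$

Lemma 3. If $\liminf_n s_n^2 > 0$ and $s_n^2/n = o(1)$, then (2.14) holds.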

Proof of Theorem 2. It follows, from (1.1) and (2.4), that $2E_nF_2(n)$
is bounded by

(2.16)
$$\begin{aligned}
\sum_{i=1}^\infty 2\binom{n}{2}p_{in}^2(1 - p_{in})^{n-2}
&\le \sum_{i=1}^\infty (np_{in})^2\{(1 - p_{in})^{n-1} + p_{in}\} I\{np_{in} \le \varepsilon s_n\} \\
&\quad + \sum_{i=1}^\infty (np_{in})^2 e^{-(n-2)p_{in}} I\{np_{in} > \varepsilon s_n\} \\
&\le \varepsilon s_n E_nF_1(n) + (\varepsilon s_n)^2 + o(s_n^2),
\end{aligned}$$

so that, due to $E_nF_1(n) = O(1)$, $s_n^2 = O(1)$ by Lemma 1(i). Thus, by (2.1)
and (2.4),

(2.17)
$$\sum_{i=1}^\infty (np_{in})^2 e^{-np_{in}} \le \sum_{i=1}^\infty (np_{in})^2 e^{-np_{in}} I\{np_{in} > \varepsilon s_n\} + \varepsilon s_n^3 \to 0$$

as $n \to \infty$ and then $\varepsilon \to 0+$. Since $E_n\delta_{ij}(n) = \binom{n}{j}p_{in}^j(1 - p_{in})^{n-j}$, (2.17)
implies

$$0 \le E_n\{F_1(n) - nQ_n\} = \sum_{i=1}^\infty np_{in}\{(1 - p_{in})^{n-1} - (1 - p_{in})^n\}
= \sum_{i=1}^\infty np_{in}^2(1 - p_{in})^{n-1} \to 0,$$

so that $nE_nQ_n \to c^*$. Since $\{\delta_{i0}(n), i \ge 1\}$ have negative correlation, (2.17)
also implies

$$\mathrm{Var}_n(nQ_n) \le \sum_{i=1}^\infty \mathrm{Var}(np_{in}\delta_{i0}(n)) \le \sum_{i=1}^\infty (np_{in})^2 e^{-np_{in}} \to 0.$$

Thus, $E_n(nQ_n - c^*)^2 \to 0$. Similarly, $E_nF_2(n) \le (e^2/2)\sum_{i=1}^\infty (np_{in})^2 e^{-np_{in}} \to 0$.

Let $\tilde Q_n = \sum_{i=1}^\infty p_{in}\delta_{i0}(N_n)$. By (2.17), $\mathrm{Var}_n(n\tilde Q_n) \le \sum_{i=1}^\infty (np_{in})^2 e^{-np_{in}} = o(1)$.
By (2.17) and then Lemma 1(i), $nE\tilde Q_n = \sum_{i=1}^\infty np_{in}e^{-np_{in}} = s_n^2 + o(1) = c^* + o(1)$.
These imply $n\tilde Q_n = c^* + o_{P_n}(1)$. Thus, by Lemma 3,

$$F_1(n) - F_1(N_n) = \zeta_n + nQ_n - (\zeta_{nn} + n\tilde Q_n) = \zeta_n - \zeta_{nn} + o_{P_n}(1) = o_{P_n}(1).$$

Since $F_1(N_n) = \sum_{i=1}^\infty \delta_{i1}(N_n)$ is a sum of independent Bernoulli variables with
uniformly small probabilities $E_n\delta_{i1}(N_n) = np_{in}e^{-np_{in}} \le \{\sum_{i=1}^\infty (np_{in})^2 e^{-np_{in}}\}^{1/2} = o(1)$,
$F_1(n) = F_1(N_n) + o_{P_n}(1)$ converges in distribution to a Poisson variable
with mean $E_nF_1(N_n) = nE\tilde Q_n \to c^*$. □

Proof of Theorem 1. Assume, without loss of generality, that

$$E_nF_j(n)/n \to c_j, \quad j = 1, 2, \qquad E_nF_1(n) + 2E_nF_2(n) \to c^*,$$

with $c_1 \in [0, 1)$, $c_2 \in [0, 1]$ and $c^* \in [0, \infty]$ (taking a subsequence if necessary).

Case 1. $c_1 > 0$. It follows from the theorem of Esty [6] that (1.3) holds.
Moreover, since $s_n^2/n \to c_1 + 2c_2 > 0$ by Lemma 1(i), (2.4) holds as in Remark
2. Thus, (1.3), (2.3) and (2.4) all hold.

Case 2. $c_1 = c^* = 0$. Since $E_nF_1(n) \to 0$ and $Z_n \le 0$ for $F_1(n) = 0$,

$$P_n(Z_n \le 0) \ge P_n(F_1(n) = 0) \to 1.$$

Thus, (1.3) does not hold. Similarly, (1.5) does not hold. Since $c^* = 0$, (2.3)
does not hold.

Case 3. $c_1 = 0 < c^*$. By (1.1), $2E_nF_2(n)/(n-1)$ is bounded by

$$\sum_{i=1}^\infty np_{in}^2(1 - p_{in})^{n-2} \le \frac{M}{1 - M/n}\sum_{i=1}^\infty p_{in}(1 - p_{in})^{n-1} + \sup_{p > M/n} np(1 - p)^{n-2}.$$

Since $\sum_{i=1}^\infty p_{in}(1 - p_{in})^{n-1} = E_nF_1(n)/n \to c_1 = 0$, we find $E_nF_2(n)/n \to 0 = c_2$,
which then implies $s_n^2/n \to 0$ by Lemma 1(i). In addition, Lemma 1(i)
implies $\{E_nF_1(n) + 2E_nF_2(n)\}/s_n^2 \to 1$, so that $s_n^2 \to c^* > 0$. Thus, (2.14)
holds by Lemma 3, and (1.3) holds if and only if $\zeta_{nn}/s_n \to N(0, 1)$ in view
of (2.6). Therefore, by Theorem 3 with $\lambda = n$, (1.3) holds if and only if both
(2.3) and (2.4) hold.

We have proved the first assertion of the theorem, since (1.3) holds if and
only if both (2.3) and (2.4) hold in all the three cases. It remains to prove
that (1.3) implies (1.5) and (2.5), and that (2.3) and (2.4) are equivalent
under (1.5).

We first prove the equivalence of (1.3) and (1.5) under (2.3). For fixed
$(j, n)$, $\delta_{ij}(n)$ are Bernoulli variables with $\mathrm{Cov}_n(\delta_{ij}(n), \delta_{i'j}(n)) \le 0$, so that
$\mathrm{Var}_n(F_j(n)) \le E_nF_j(n)$ and

$$\mathrm{Var}_n(F_1(n) + 2F_2(n)) \le 2\{E_nF_1(n) + 4E_nF_2(n)\}.$$

Since $E_nF_1(n) + 2E_nF_2(n) \to \infty$, $\{F_1(n) + 2F_2(n)\}/\{E_nF_1(n) + 2E_nF_2(n)\} \to 1$
in $P_n$ by the above inequality. Similarly, $F_1^2(n)/n = (1 + o_{P_n}(1))\{E_nF_1(n)\}^2/n$.
Moreover, since $\{E_nF_1(n)\}^2/n = (c_1 + o(1))E_nF_1(n)$ with $c_1 < 1$,
$E_nF_1(n)\{1 - E_nF_1(n)/n\} + 2E_nF_2(n)$ is of the same order as $E_nF_1(n) + 2E_nF_2(n)$. Thus,
(1.3) and (1.5) are equivalent under (2.3).

Assume (1.3) holds. Since (2.3) holds, (1.5) holds. Since the Lindeberg
condition (2.4) holds,

(2.18)
$$2E_nF_2(n) = o(s_n)E_nF_1(n) + o(s_n^2)$$

by (2.16). Thus, (2.3) and Lemma 1(i) provide

$$s_n^2 = (1 + o(1))\{E_nF_1(n) + 2E_nF_2(n)\} = (1 + o(s_n))E_nF_1(n) + o(s_n^2) \to \infty,$$

which implies $s_n = o(1)E_nF_1(n)$ and $\mathrm{Var}_n(F_1(n)) \le E_nF_1(n) \to \infty$.
Consequently, $s_n = o_{P_n}(F_1(n))$, and then, by (1.3), $n\hat Q_n - nQ_n = O_{P_n}(s_n) =
o_{P_n}(F_1(n)) = o_{P_n}(n\hat Q_n)$. Thus, (1.3) implies (2.5) as well as (1.5).

Now, we assume (1.5). If (2.3) holds, then (1.3) holds due to its equivalence
to (1.5), so that (2.4) must hold. It remains to prove (2.3); that is, $c^* = \infty$
under (2.4). Since (1.5) holds, Case 2 is ruled out, so that $c^* > 0$. If $0 < c^* < \infty$,
Lemma 1(i) implies $s_n^2 = (1 + o(1))\{E_nF_1(n) + 2E_nF_2(n)\} = O(1)$, and then
(2.18) implies $E_nF_2(n) = o(1)$, so that $E_nF_1(n) \to c^*$. Thus, by Theorem
2, $0 < c^* < \infty$ would imply the convergence of $\sqrt{c^*}Z_n$ in distribution to
$N_{c^*} - c^*$ and the convergence of $F_1(n)(1 - F_1(n)/n) + 2F_2(n)$ to $N_{c^*}$. This
is impossible since (1.5) holds. Hence, $c^* = \infty$. □

3. Examples. We provide three theoretical examples and describe one
real application. In all theoretical examples, we define $p_{in} \propto p_n(i)$ with
$\int_0^\infty p_n(x)\,dx = 1$. The density functions $p_n(x)$ are decreasing in $x > 0$ and
sufficiently regular to allow the following approximations within an
infinitesimal fraction:

(3.1)
$$E_nF_1(n) \approx \int_0^\infty np_n(x)e^{-np_n(x)}\,dx, \qquad
s_n^2 \approx \int_0^\infty np_n(x)\{1 + np_n(x)\}e^{-np_n(x)}\,dx.$$

Example 1 (Fixed discrete Paretos). In this example, Theorem 1 provides
the asymptotic normality, but Esty's [6] condition $E_nF_1(n)/n \to c_1 \in (0, 1)$
does not hold. Let $p_n(x) = p(x) = a/(x+1)^b$ with $a > 0$ and $b > 1$.
Condition (2.2) is satisfied, since $E_nF_1(n)/n \approx \int_0^\infty p(x)e^{-np(x)}\,dx \to 0$. For
large $n$, changing variable $t = np(x) = na/(x+1)^b$ yields

$$E_nF_1(n) \approx \int_0^\infty te^{-t}\,d(na/t)^{1/b} = \frac{(na)^{1/b}}{b}\int_0^{na} t^{-1/b}e^{-t}\,dt \propto n^{1/b},$$

so that (2.3) holds and $s_n/\log n \to \infty$ by Lemma 1(i). It follows that (2.4)
holds by Remark 2. Thus, the central limit theorems (1.3) and (1.5) both
hold by Theorem 1.
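
The growth rate $n^{1/b}$ in Example 1 can be examined numerically. The sketch below computes $E_nF_1(n)$ exactly for probabilities proportional to $(i+1)^{-b}$, with the infinite species list truncated to a finite number of species; that truncation, the function name `expected_singletons` and the values $b = 2$, $n = 1000$ and $n = 4000$ are assumptions of this illustration. Quadrupling $n$ should then multiply $E_nF_1(n)$ by roughly $4^{1/b} = 2$.

```python
def expected_singletons(n, b, n_species=100_000):
    """E_n F_1(n) for p_i proportional to (i + 1)^(-b), truncated to n_species."""
    weights = [(i + 1.0) ** (-b) for i in range(1, n_species + 1)]
    total = sum(weights)
    # Exact multinomial expectation: sum of n * p_i * (1 - p_i)^(n - 1).
    return sum(n * (w / total) * (1.0 - w / total) ** (n - 1) for w in weights)

b = 2.0
ratio = expected_singletons(4000, b) / expected_singletons(1000, b)
```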

Example 2 (Dynamic discrete exponentials). In this example, (2.3) and
(2.4) are equivalent. Let $p_n(x) = a_n^{-1}e^{-x/a_n}$ with $a_n/n \le M < \infty$. Let $t =
np_n(x)$. By (3.1),

$$\frac{E_nF_1(n)}{n} \approx n^{-1}\int_0^{n/a_n} te^{-t}\,d(a_n\log t) = \frac{a_n}{n}\int_0^{n/a_n} e^{-t}\,dt < 1,$$

so that (2.2) holds. Similarly, $s_n^2 \approx a_n\int_0^{n/a_n}\{1 + t\}e^{-t}\,dt$ by (3.1), so that $s_n^2$
is of the order $a_n$. Moreover, the Lindeberg condition (2.4) is equivalent to

$$o(1) = \frac{1}{s_n^2}\int_{np_n(x) > \varepsilon s_n}\{np_n(x)\}^2 e^{-np_n(x)}\,dx = \frac{a_n}{s_n^2}\int_{\varepsilon s_n}^{n/a_n} te^{-t}\,dt,$$

which holds if and only if $s_n^2 \asymp a_n \to \infty$, if and only if (2.3) holds by Lemma
1(i).

Example 3 (Dynamic two-step functions). This example demonstrates
that the three conditions of Theorem 1 are not redundant. Let $a_{jn} \to \infty$ and
$w_{1n} + w_{2n} = 1$ with $w_{1n}/a_{1n} \ge w_{2n}/a_{2n} \ge 0$. Set $p_n(x) = \sum_{j=1}^2 w_{jn}a_{jn}^{-1}I\{0 <
(-1)^j(x - a_{1n}) \le a_{jn}\}$. By (3.1),

$$E_nF_1(n) \approx n\sum_{j=1}^2 w_{jn}e^{-b_{jn}}, \qquad
s_n^2 \approx n\sum_{j=1}^2 w_{jn}(1 + b_{jn})e^{-b_{jn}}, \qquad b_{jn} = nw_{jn}/a_{jn}.$$

Moreover, the Lindeberg condition (2.4) holds if and only if

$$\frac{n}{s_n^2}\sum_{j=1}^2 w_{jn}b_{jn}e^{-b_{jn}} I\{b_{jn} > \varepsilon s_n\} \to 0 \qquad \forall \varepsilon > 0.$$

Case 1. $w_{1n} = 1$ and $b_{1n} \not\to 0$. The $p_n(x)$ are uniform densities in $(0, a_{1n})$.
Condition (2.2) holds, since $E_nF_1(n)/n \approx e^{-b_{1n}} \not\to 1$. Since $1 + b_{1n}$ is of the
same order as $b_{1n}$, (2.4) holds if and only if $b_{1n}/s_n \to 0$, so that (2.4) implies
(2.3). Let $b_{1n} = \log n - \log\log n$. We find $s_n^2 \approx (1 + b_{1n})\log n \approx b_{1n}^2 \to \infty$.
Thus, both (2.2) and (2.3) hold but (2.4) does not.
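
The failure of (2.4) in Case 1 can be seen by plugging $b_{1n} = \log n - \log\log n$ into the approximate expressions of this example. The following sketch does so for a few sample sizes; the function name `uniform_lindeberg`, the value $\varepsilon = 0.5$ and the chosen sample sizes are assumptions for illustration, and the formulas used are the approximations from (3.1) rather than exact multinomial quantities.

```python
import math

def uniform_lindeberg(n, eps):
    """Approximate left side of (2.4) in Case 1 with b_1n = log n - log log n."""
    b = math.log(n) - math.log(math.log(n))
    s2 = n * (1.0 + b) * math.exp(-b)       # approximate s_n^2 with w_1n = 1
    indicator = 1.0 if b > eps * math.sqrt(s2) else 0.0
    return (n / s2) * b * math.exp(-b) * indicator

# The values stay bounded away from zero (they equal b / (1 + b) here).
values = [uniform_lindeberg(10 ** k, eps=0.5) for k in (3, 6, 9, 12)]
```

Because $s_n$ is of the same order as $b_{1n}$, the indicator stays equal to one for $\varepsilon < 1$ and the ratio approaches one instead of zero.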

Case 2. $w_{1n} = 1$ and $b_{1n} \to 0$. The $p_n(x)$ are still uniform. Since $E_nF_1(n)/n \approx
e^{-b_{1n}} \to 1$, (2.2) does not hold. On the other hand, $s_n^2 \approx n(1 + b_{1n})e^{-b_{1n}} \to \infty$
and $b_{1n}/s_n \to 0$. Thus, both (2.3) and (2.4) hold but (2.2) does not.

Case 3. $w_{1n} = (1 - 1/n)$, $b_{1n} = 2\log n$ and $b_{2n} \to 0$. Since $E_nF_1(n)/n =
o(1)$ and $s_n^2 = o(1) + nw_{2n}(1 + o(1)) \to 1$, both (2.2) and (2.4) hold but (2.3)
does not.

Example 4 (A genomic application). Mao and Lindsay [15] studied
a gene expression problem based on a sample of $n = 2568$ expressed
sequence tags from a tomato flower cDNA library. The data came from the
Institute for Genomic Research. A detailed description of the data set may
also be found in Quackenbush et al. [16]. In this context, $Q_n$ is the
probability that the next randomly selected expressed sequence tag will stand
for a new gene. A quantification of $Q_n$ will then be an informative
indicator pertaining to the depth of the sample collected thus far regarding
the levels of expression of the genes in the library. For this particular
data set, $n = 2568$, $F_1(n) = 1434$, $F_2(n) = 253$, $F_3(n) = 71$, $F_4(n) = 33$,
$F_5(n) = 11$, $F_6(n) = 6$, $F_7(n) = 2$, $F_8(n) = 3$, $F_9(n) = 1$, $F_{10}(n) = F_{11}(n) = 1$
and $F_{12}(n) = F_{13}(n) = F_{14}(n) = F_{16}(n) = F_{23}(n) = F_{27}(n) = 1$, resulting in
$\hat Q_n = 0.5584$. By (1.5), the 95% confidence interval for $Q_n$ is (0.5391, 0.5777),
which incidentally is narrower than the 95% confidence interval produced
by Mao and Lindsay [15], (0.529, 0.580). Our confidence interval is not new,
since it was based on an identical expression given by Esty [6]. However, we
take a bit more comfort in such applications, in knowing that the validity
of the confidence interval is supported by a larger family of distributions as
a result of Theorem 1.

Remark 5. The procedure introduced by Mao and Lindsay [15] is applicable
not only to the total probability associated with nonrepresented genes
but also to that associated with genes represented with frequencies lower than
a threshold. They took a different perspective on the problem from that of
Esty [6] and, hence, ours. Specifically, their derivation started by directly
assuming $(X_i(n), i \ge 1)$ to be independent Poisson random variables with
means $(\lambda_i, i \ge 1)$, which are themselves an i.i.d. sample from a latent distribution.
Their results are based on an asymptotic argument with the number of
species (genes) approaching infinity.

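APPENDIX: PROOFS OF LEMMAS

Proof of Lemma 1. (i) Since $1 - p \le e^{-p}$,

$$\begin{aligned}
E_nF_1(n) + 2E_nF_2(n) &= \sum_{i=1}^\infty \{np_{in}(1 - p_{in})^{n-1} + n(n-1)p_{in}^2(1 - p_{in})^{n-2}\} \\
&\le \sum_{i=1}^\infty np_{in}(1 + np_{in})e^{-(n-2)p_{in}} \\
&\le e^{2\varepsilon}s_n^2 + \sum_{i=1}^\infty np_{in}(1 + n)e^{-(n-2)\varepsilon}.
\end{aligned}$$

Since $1 - p \ge e^{-p-p^2}$ for $0 \le p \le 1/2$ and $1 - p + (n-1)p \ge (1 - 1/n)(1 - p)^2(1 + np)$,

$$\begin{aligned}
E_nF_1(n) + 2E_nF_2(n) &= \sum_{i=1}^\infty np_{in}(1 - p_{in})^{n-2}(1 - p_{in} + (n-1)p_{in}) \\
&\ge (1 - 1/n)\sum_{i=1}^\infty np_{in}(1 + np_{in})e^{-np_{in}-\varepsilon} I\{np_{in}^2 \le \varepsilon\} \\
&\ge (1 - 1/n)e^{-\varepsilon}s_n^2 - n^2 e^{-\sqrt{n\varepsilon}}.
\end{aligned}$$

(ii) For all $\lambda' < \lambda$ and $\varepsilon > 0$,

$$\begin{aligned}
(\lambda'/\lambda)^2 s_{\lambda n}^2 \le s_{\lambda' n}^2
&\le e^{\varepsilon}s_{\lambda n}^2 + \sum_{i=1}^\infty \lambda p_{in}(1 + \lambda p_{in})e^{-\lambda' p_{in}} I\{(\lambda - \lambda')p_{in} > \varepsilon\} \\
&\le e^{\varepsilon}s_{\lambda n}^2 + \lambda(1 + \lambda)\exp(-\lambda'\varepsilon/(\lambda - \lambda')).
\end{aligned}$$

This gives (2.15), and the rest follows easily. □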

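Proof of Lemma 2. Let $Y_{i\lambda n} = \delta_{i1}(N_\lambda) - \lambda p_{in}\delta_{i0}(N_\lambda)$ be as in (2.7).
For $t > \lambda$,

(A.1)
$$\begin{aligned}
Y_{itn} - Y_{i\lambda n} &= \delta_{i1}(N_t) - tp_{in}\delta_{i0}(N_t) - \delta_{i1}(N_\lambda) + \lambda p_{in}\delta_{i0}(N_\lambda) \\
&= \delta_{i1}(N_\lambda)\{\delta_{i1}(N_t) - 1\} + \delta_{i0}(N_\lambda)\{\delta_{i1}(N_t) - tp_{in}\delta_{i0}(N_t) + \lambda p_{in}\delta_{i0}(N_\lambda)\} \\
&= -Y_{i\lambda n} I\{X_i(N_t) > X_i(N_\lambda)\} + \delta_{i0}(N_\lambda)\{\delta_{i1}(N_t) - (t - \lambda)p_{in}\delta_{i0}(N_t)\}.
\end{aligned}$$

The above identity can be verified by checking both the cases of $\delta_{i0}(N_\lambda) \in
\{0, 1\}$ and by noticing that $\delta_{ij}(N_\lambda)\{1 - \delta_{ij}(N_t)\} = \delta_{ij}(N_\lambda) I\{X_i(N_t) > X_i(N_\lambda)\}$.

Let $T_i = \min\{t : X_i(N_t) > X_i(N_\lambda)\}$. Since $\{Y_{i\lambda n}, i \ge 1\}$ are independent
variables with mean zero and independent of $\{X(N_t) - X(N_\lambda), t \ge \lambda\}$, by
Doob's inequality for martingales,

(A.2)
$$\begin{aligned}
E_n\max_{\lambda \le t \le \lambda+\Delta}\left[\sum_{i=1}^\infty Y_{i\lambda n} I\{X_i(N_t) > X_i(N_\lambda)\}\right]^2
&= E_n\max_{\lambda \le t \le \lambda+\Delta}\left[\sum_{T_i \le t} Y_{i\lambda n}\right]^2 \\
&\le 4\sum_{i=1}^\infty E_nY_{i\lambda n}^2 I\{X_i(N_{\lambda+\Delta}) > X_i(N_\lambda)\} \\
&= 4\sum_{i=1}^\infty \lambda p_{in}(1 + \lambda p_{in})e^{-\lambda p_{in}}(1 - e^{-\Delta p_{in}}).
\end{aligned}$$

For the second term on the right-hand side of (A.1), we have

$$\begin{aligned}
E_n\sup_{\lambda \le t \le \lambda+\Delta}\left|\sum_{i=1}^\infty \delta_{i0}(N_\lambda)\{\delta_{i1}(N_t) - (t - \lambda)p_{in}\delta_{i0}(N_t)\}\right|
&\le \sum_{i=1}^\infty E_n\delta_{i0}(N_\lambda)\bigl(P_n\{X_i(N_{\lambda+\Delta}) > X_i(N_\lambda)\} + \Delta p_{in}\bigr) \\
&\le \sum_{i=1}^\infty e^{-\lambda p_{in}}\,2\Delta p_{in}.
\end{aligned}$$

This and (A.2) yield the conclusion in view of (A.1). □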

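Proof of Lemma 3. Let $t_n$ be the arrival time of the $n$th event in the
Poisson process $N_t$, with $N_{t_n} = n$. Since $\zeta_n - \zeta_{t_n n} = (t_n - n)\sum_{i=1}^\infty p_{in}\delta_{i0}(n)$,
we have

(A.3)
$$\begin{aligned}
P_n\{|\zeta_n - \zeta_{nn}| > \varepsilon s_n\}
&\le P_n\{|t_n - n| > \Delta/2\} \\
&\quad + P_n\left\{\max_{n-\Delta/2 \le t \le n+\Delta/2} |\zeta_{nn} - \zeta_{tn}| + (\Delta/2)\sum_{i=1}^\infty p_{in}\delta_{i0}(n) > \varepsilon s_n\right\}.
\end{aligned}$$

Set $\lambda = n - \Delta/2$. Since $E_n\delta_{i0}(n) = (1 - p_{in})^n \le e^{-np_{in}} \le e^{-\lambda p_{in}}$, by Lemma
2,

(A.4)
$$\begin{aligned}
&E_n\left[\max_{n-\Delta/2 \le t \le n+\Delta/2} |\zeta_{nn} - \zeta_{tn}| + (\Delta/2)\sum_{i=1}^\infty p_{in}\delta_{i0}(n)\right] \\
&\qquad \le 4\left\{\sum_{i=1}^\infty \lambda p_{in}(1 + \lambda p_{in})e^{-\lambda p_{in}}(1 - e^{-\Delta p_{in}})\right\}^{1/2} + (4 + 1/2)\sum_{i=1}^\infty \Delta p_{in}e^{-\lambda p_{in}}.
\end{aligned}$$

Since $t_n$ has the gamma$(n, 1)$ distribution, $E_n(t_n - n)^2 = n$. Thus, by (A.3)
and (A.4), (2.14) holds via the Markov inequality, provided that

(A.5)
$$s_n^{-2}\sum_{i=1}^\infty \lambda p_{in}(1 + \lambda p_{in})e^{-\lambda p_{in}}(1 - e^{-\Delta p_{in}}) \to 0, \qquad
\frac{\Delta}{s_n}\sum_{i=1}^\infty p_{in}e^{-\lambda p_{in}} \to 0,$$

with $n - \lambda = \Delta/2 = M\sqrt{n} = O(\sqrt{\lambda})$ for all $0 < M < \infty$.

It remains to prove (A.5). Since $\liminf_n s_n^2 > 0$, $s_{\lambda n}^2/s_n^2 \to 1$ by
Lemma 1(ii). Since $s_n^2/n = o(1)$, the second part of (A.5) holds due to
$(\Delta/s_n)\sum_{i=1}^\infty p_{in}e^{-\lambda p_{in}} \le s_{\lambda n}^2\Delta/(\lambda s_n) = O(1)s_n/\sqrt{\lambda} = o(1)$. For the first part
of (A.5),

$$\begin{aligned}
\sum_{i=1}^\infty \lambda p_{in}(1 + \lambda p_{in})e^{-\lambda p_{in}}(1 - e^{-\Delta p_{in}})
&\le \varepsilon s_{\lambda n}^2 + \sum_{i=1}^\infty \lambda p_{in}(1 + \lambda p_{in})e^{-\lambda p_{in}} I\{\Delta p_{in} > \varepsilon\} \\
&\le \varepsilon s_{\lambda n}^2 + \lambda(1 + \lambda)e^{-\lambda\varepsilon/\Delta} \le (1 + o(1))\varepsilon s_n^2 + o(1).
\end{aligned}$$

Thus, since $\liminf_n s_n^2 > 0$, the proof is complete. □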